G: Bandits, Experts and Games (09/19/16), Lecture 3: Lower Bounds for Bandit Algorithms
Abstract
Note that (2) implies (1): if the regret is high in expectation over problem instances, then there exists at least one problem instance with high regret. Conversely, (1) implies (2) when |F| is a constant. This can be seen as follows: suppose that any algorithm has high regret (say H) on one problem instance in F and low regret on all other instances in F. Then, taking the uniform distribution over F, any algorithm has expected regret at least H/|F|. (This argument breaks down if |F| is large.) If we prove a stronger version of (1), stating that for any algorithm the regret is high on a constant fraction of the problem instances in F, then, again considering the uniform distribution over F, we obtain (2) regardless of whether |F| is large. In this lecture, to prove lower bounds, we consider 0-1 rewards and the following family of problem instances (with ε fixed, to be adjusted in the analysis):
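The abstract breaks off here. For concreteness, a standard family that matches this setup (0-1 rewards, K arms, a margin parameter ε) is the "needle in haystack" construction; it is the usual choice in such lower-bound proofs and is given here as an assumption, not as the notes' verbatim definition:

\[
\mathcal{I}_j:\qquad \mu_i \;=\; \begin{cases} (1+\varepsilon)/2, & i = j,\\[2pt] 1/2, & i \neq j, \end{cases} \qquad j \in \{1, \dots, K\},
\]

where \(\mu_i\) denotes the mean reward of arm \(i\), so each instance \(\mathcal{I}_j\) plants a single best arm \(j\) that beats every other arm by a margin of \(\varepsilon/2\).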
Similar Resources
G: Bandits, Experts and Games (09/12/16), Lecture 4: Lower Bounds (ending); Thompson Sampling
Here ε is a parameter to be adjusted in the analysis. Recall that K is the number of arms. We considered a "bandits with predictions" problem, and proved that it is impossible to make an accurate prediction with high probability if the time horizon is too small, regardless of what bandit algorithm we use to explore and make the prediction. In fact, we proved it for at least a third of problem instances…
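The "at least a third of problem instances" statement is precisely the constant-fraction strengthening of (1) discussed in the abstract above. Written out, if an algorithm A has regret R(A, I) ≥ H on a subset G ⊆ F with |G| ≥ |F|/3, and regret is nonnegative on the remaining instances (an assumption made for this sketch; the notation R(A, I) is introduced here for illustration), then under the uniform distribution over F:

\[
\mathbb{E}_{I \sim \mathrm{Unif}(F)}\big[R(A, I)\big] \;=\; \frac{1}{|F|} \sum_{I \in F} R(A, I) \;\ge\; \frac{|G|}{|F|}\, H \;\ge\; \frac{H}{3},
\]

which lower-bounds the expected regret independently of |F|.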
G: Bandits, Experts and Games (10/10/16), Lecture 6: Lipschitz Bandits
Motivation: similarity between arms. In various bandit problems, we may have information on similarity between arms, in the sense that "similar" arms have similar expected rewards. For example, arms can correspond to "items" (e.g., documents) with feature vectors, and similarity can be expressed as some notion of distance between feature vectors. Another example is the dynamic pricing problem…
Bandit-Based Estimation of Distribution Algorithms for Noisy Optimization: Rigorous Runtime Analysis
We show complexity bounds for noisy optimization, in frameworks in which the noise is stronger than in previously published papers [19]. We also propose an algorithm based on bandits (variants of [16]) that reaches the bound within logarithmic factors. We emphasize the differences with empirically derived published algorithms.
Contextual Bandits with Stochastic Experts
We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem. In our problem setting, we assume access to a class of stochastic experts, where each expert is a conditional distribution over the arms given a context. We propose upper-confidence bound (UCB) algorithms for this problem, which employ...
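The snippet is cut off before the algorithms are specified. For orientation only, the textbook UCB index that such algorithms build on scores each stochastic expert e at round t as an empirical mean plus a confidence width; this generic form is an assumption for illustration, not necessarily the estimator the paper employs:

\[
\mathrm{UCB}_t(e) \;=\; \hat{\mu}_t(e) + \sqrt{\frac{2 \ln t}{n_t(e)}},
\]

where \(\hat{\mu}_t(e)\) is the average observed reward of expert \(e\) and \(n_t(e)\) is the number of rounds in which \(e\) has been selected so far.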
Stochastic and Adversarial Combinatorial Bandits
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting, we first derive problem-specific regret lower bounds, and analyze how these bounds scale with the dimension of the decision space. We then propose COMBUCB, algorithms that efficiently exploit the combinatorial structure of the problem, and derive finite-time upper bounds on their…